Description

[0001] The present invention relates to a continuous sign language recognition apparatus and method. More
particularly, the invention relates to a technique of generating reference sign language patterns used for the
recognition of continuous sign language patterns through pattern matching between a continuous sign language and
reference sign language. The term "continuous sign language pattern" used in this specification also includes a
template pattern.
[0002] The present invention also relates to a technique of recognizing a series of continuous sign language patterns 
which are contained in a sign language and are similar to reference sign language patterns. 

[0003] The present invention also relates to a sign language translation system in which a recognized sign language
is transferred in the system in the form of texts, voices, and sign languages of another type.

[0004] As conventional techniques regarding sign recognition, there have been proposed "Hand Motion Recognition
Apparatus and Sign Language Translation System" in JP-A-2-144675 (first conventional technique) and "Hand Motion
Recognition Method using Neuro-computer" in JP-A-3-186979 (second conventional technique). According to the first
conventional technique, colored gloves are used to obtain the positional relation between fingers by an image
recognition technique. This positional relation is matched with pre-stored finger spelling patterns to recognize each
finger spelling. According to the second conventional technique, the correspondence between finger shape data inputted
from glove-like means and the meaning of the finger shape is learnt by a neural network, and the output obtained when
finger shape data is inputted to the network is used as the recognized finger spelling.

[0005] "A Hand Gesture Recognition Method and Its Application" has been proposed by Takahashi and Kishino in
Systems and Computers in Japan 2£(3) pp 1985-1992 (1992). In this method, finger shape data is inputted from
glove-like means and hand motion data is inputted via a 3D digitiser. When assessing hand signals involving motion,
the hand shape at the start is used as the input. The pattern of the hand shape is not measured as a time sequential
series. In a discrimination test, approximately 50% of the hand gestures tested were correctly identified.

[0006] A reference pattern to be used for the matching between continuous patterns and reference patterns has been
obtained heretofore by linearly normalizing sample patterns for the reference pattern in the time axis direction and by
simply averaging these normalized sample patterns.

[0007] "Motion Recognition Apparatus using Neuro-computer" has also been proposed in JP-A-4-51372 (third
conventional technique). According to the third conventional technique, which is an improved version of the second
conventional technique, a three-dimensional motion is time sequentially detected by using a recurrent type neural
network to recognize the meaning of one motion (e.g., of a sign language).

[0008] A continuous DP matching scheme has also been proposed as the recognition method through the matching
between time sequential continuous patterns and reference patterns ("Continuous Word Recognition using Continuous
DP", by OKA, the Speech Study Group of Acoustical Society of Japan, S78-20, pp.145 to 152, 1978) (fourth
conventional technique). According to the fourth conventional technique, continuous voice patterns are sequentially
matched with reference voice patterns while moving the latter in the time axis direction to recognize reference voice
patterns contained in the continuous voice patterns. This matching result is a time sequence of similarities between
the continuous patterns and reference patterns. A minimum value of the similarities at a threshold level or lower is
searched from the similarity time sequence for each reference pattern, and the time at the minimum value is used to
identify a reference pattern candidate.

[0009] The first and second conventional techniques for sign language recognition are mainly directed to finger
spellings, which are generally static patterns. They are therefore unable to recognize a usual sign language with
complicated motions of fingers and hands.

[0010] If a reference pattern is obtained from an average of sample sign language patterns by linearly
expanding/compressing them in the time domain and normalizing them without considering nonlinear
expansion/compression, the resultant reference pattern becomes damped and does not reflect the characteristics of the
original sample patterns, as shown in Fig. 4A.
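The damping described above can be reproduced with a small sketch. It is only an illustration under stated assumptions: the helper names `resample_linear` and `average_linear` are hypothetical, patterns are lists of equal-dimension vectors, and linear interpolation stands in for the linear normalization of the time axis.

```python
def resample_linear(pattern, length):
    """Linearly stretch or shrink a pattern (list of vectors) to `length` frames."""
    if length == 1 or len(pattern) == 1:
        return [list(pattern[0]) for _ in range(length)]
    scale = (len(pattern) - 1) / (length - 1)
    out = []
    for k in range(length):
        x = k * scale
        lo = int(x)
        hi = min(lo + 1, len(pattern) - 1)
        frac = x - lo
        # Interpolate between the two neighbouring frames.
        out.append([a + frac * (b - a) for a, b in zip(pattern[lo], pattern[hi])])
    return out


def average_linear(samples, length):
    """Average samples frame by frame after linear time normalization."""
    resampled = [resample_linear(s, length) for s in samples]
    dim = len(resampled[0][0])
    n = len(resampled)
    return [[sum(r[k][d] for r in resampled) / n for d in range(dim)]
            for k in range(length)]


# Two samples of the same motion whose peak occurs at different times.
early = [[0.0], [1.0], [0.0], [0.0], [0.0]]
late = [[0.0], [0.0], [0.0], [1.0], [0.0]]
damped = average_linear([early, late], 5)
```

In this toy case the averaged pattern contains two peaks of half height rather than one full peak, which is the kind of loss of characteristics the paragraph refers to.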

[0011] With the third conventional technique, more or less dynamic time sequential data can be recognized by
using a recurrent type neural network. Although it is possible to recognize time sequential data representing one
motion, it is difficult to properly cut out each sign language word from a series of words, which are often used in
a practical sign language.

[0012] With the fourth conventional technique for the matching between continuous patterns and reference patterns,
if data of both the continuous patterns and reference patterns sampled at a predetermined timing is used as it is, the
time required for the matching increases in proportion to the length of the continuous patterns and the number of
reference patterns.

[0013] Other issues associated with sign language translation are as follows.

(1) Finger shapes, hand positions, and their motions differ from person to person. According to the teaching of
voice recognition, voices of a known person can be recognized more easily than voices of unknown persons. In the
case of finger spellings, the number of finger spellings is as small as 50 words. It is therefore possible to
register finger spellings of a particular person, or to learn the weight coefficients of a neural network dedicated
to a particular person. However, in the case of a sign language, the number of basic words is as large as 1000
words or more. Therefore, the registration or learning for a particular person is impossible.

(2) It is frequently desired to refer to or to store past sign language conversations. However, this function has
not been realized as yet.

(3) There are few sign words accompanied with emotions. For this reason, the facial expression or large body
motion has been used. However, a normal person generally concentrates on the recognition of a sign language
only, and the facial expression or large body motion is often disregarded. Accordingly, in order to realize a natural
speech, it is necessary for a sign language translation system to provide a function of translating a sign language
with emotion.

[0014] It is a first object of the present invention to provide an apparatus and method of sequentially recognizing
reference sign language patterns contained in general continuous sign language patterns represented by the motions of
fingers and hands.

[0015] It is a further object of the present invention to provide a sign language translation system to be used by a
plurality of unknown users, capable of recognizing a continuous sign language containing a number of words and having
different characteristics, and transferring the recognized sign language in the form of various communication types.
[0016] According to the present invention, there is provided a sign language translation system having input means
for inputting a continuously expressed sign language as time sequential data, means for recognizing a sign language
from said time sequential data, and means for translating said recognized sign language into a spoken language, said
sign language recognition means comprising:

a sign language word dictionary for storing first sign language time sequential data of each sign language word;

means for calibrating second sign language time sequential data inputted from said input means so as to make said
first sign language time sequential data stored in said sign language word dictionary correspond to said second
sign language time sequential data;

means for matching said second sign language time sequential data calibrated by said calibration means with said
first sign language time sequential data stored in said sign language word dictionary, and recognizing the sign
language word corresponding to said second sign language time sequential data;

means for inputting a portrait of a sign language user, look recognition means for recognizing the look of the portrait
to obtain the emotion type and the emotion degree (intensity), and processor means for inputting a spoken
language outputted from said spoken language translation means and the emotion type and degree outputted from
said look recognition means, and outputting a spoken language added with emotional adjectives.

[0017] A reference sign language pattern which may be used in the present invention can be generated as an average
of correspondence points between sample sign language patterns, the correspondence points being obtained by a DP
matching after normalizing the sample patterns, while considering the nonlinear compression/expansion of the sample
patterns.

[0018] Reference sign language patterns contained in continuous sign language patterns can be recognized by a
continuous DP matching between each continuous sign language pattern and reference sign language patterns while
allowing a nonlinear expansion/compression.

[0019] The characteristic points are derived from a sign language pattern. These points can be derived as the time
when the velocity of an n-th order vector constituting the sign language pattern becomes minimum, the time when the
direction change of the velocity vector exceeds a threshold value, the time when an accumulated value of direction
changes of the velocity vector exceeds a threshold value, and the time when the acceleration becomes minimum. In
accordance with all or some of these characteristic points, the sign language pattern can be efficiently compressed
without damaging the characteristics of the sign language pattern, by using all or some of pattern vectors at respective
characteristic points, linear approximations of vectors between characteristic points, and time lengths between
characteristic points. Each compressed continuous sign language pattern can be matched directly with the compressed
reference sign language patterns, reducing the recognition time.

[0020] The normalization in the DP matching considering a nonlinear compression/expansion makes it possible to
generate a reference sign language pattern without damaging the characteristics of the sign language patterns.
[0021] A general sign language with motions of fingers and hands can be continuously recognized by a continuous
DP matching between the continuous sign language patterns and reference sign language patterns.

[0022] Furthermore, a sign language pattern can be compressed in accordance with the characteristic points specific
to the pattern, resulting in an efficient compression without damaging the characteristics. The compressed continuous
sign language pattern may be directly matched with the compressed reference sign language patterns, providing a high
speed recognition.

[0023] Still further, to be used in the present invention there are provided a sign language word dictionary for storing
first sign language time sequential data of each sign language word as sign language word dictionary data, calibration
means for calibrating second sign language time sequential data inputted as the continuous motions of fingers and
hand, so as to make the second sequential data correspond to the first sign language time sequential data, and sign
language translation means for matching an output of the calibration means with the sign language word dictionary data
and recognizing the sign language data corresponding to the inputted second sign language time sequential data.
[0024] There are also provided matching means for receiving the second sign language time sequential data of a sign
language word and the sign language word dictionary data corresponding to the second sign language time sequential
data, obtaining and outputting correspondences between both the data at each timing, and selection means for
selecting one of the inputted second sign language time sequential data and an output of the matching means, and
outputting the selected one to the calibration means, wherein the calibration means learns the recognition parameters
for the calibration in accordance with an output of the matching means.

[0025] A neural network is further provided for learning the recognition parameters for the calibration. 

[0026] There is also provided spoken language translation means for adding a dependent word to the sign language
word outputted from the sign language translation means, in accordance with a rule, and outputting a spoken language.
[0027] There are also provided means for inputting a portrait of a sign language user, and look recognition means for
recognizing the look of the portrait to obtain the emotion type such as delight and grief and the emotion degree
(intensity), wherein a spoken language added with emotional adjectives is outputted by using a spoken language
outputted from the spoken language translation means and the emotion type and degree outputted from the look
recognition means.

[0028] A sign language translation system may be installed at each of a plurality of stations connected to a local area 

network so that information can be transferred between a plurality of sign language translation systems. 

[0029] Sign language time sequential data inputted from glove-like data input means may be matched with each sign
language word dictionary data while moving the latter in the time domain. The sign language word dictionary data can
be matched with the time sequential data at respective maximum similarities (indicated by broken lines in Fig. 19). A
characteristic of a sign language specific to each user, i.e. a difference between some inputted sign language word
data and the corresponding sign language word dictionary data, can be learnt as a data conversion rule (calibration
rule). By using the conversion rule, the sign language can be converted into data more similar to the sign language
word dictionary data, improving the recognition precision.

[0030] With the look recognition means, the emotion expression can be translated realizing a more natural conversa- 
tion. 

[0031] In the drawings: 



Fig. 1 shows the structure of a sign language recognition apparatus using a sign language recognition unit 2 which
may be used in an embodiment of the present invention;
Fig. 2 is a diagram showing the structure of a reference sign language pattern generating unit 26 of the sign
language recognition unit;
Fig. 3 is a diagram illustrating start/end points fixed DP matching;
Fig. 4A is a diagram showing an average pattern when the time axis is linearly normalized;
Fig. 4B is a diagram showing an average pattern when correspondence points between sample patterns are
obtained by DP matching;
Fig. 5 is a diagram showing the structure of a pattern compression unit 266 of the reference sign language pattern
generating unit 26;
Fig. 6 is a diagram showing the manner in which the input sign language pattern is compressed;
Fig. 7 is a block diagram of a continuous sign language recognition unit 28 of the sign language recognition unit 2;
Fig. 8 is a diagram showing an example of DP path used by general continuous DP matching;
Fig. 9 is a diagram illustrating the principle of continuous DP matching;
Fig. 10 is a diagram showing DP paths used in continuous DP matching of a compressed sign language pattern;
Fig. 11 is a diagram illustrating the principle of continuous DP matching of a compressed sign language pattern;
Fig. 12 is a diagram illustrating the correspondence between characteristic points by continuous DP matching;
Fig. 13 is a flow chart illustrating the operation of reference pattern generation;
Fig. 14 is a flow chart illustrating the operation of a pattern compression process;
Fig. 15 is a flow chart illustrating the operation of a sign language recognition process;
Fig. 16 is a diagram showing the whole structure of a sign language translation system 100;
Fig. 17 is a diagram showing a sign language recognition unit 2 which may be used in an embodiment of the
invention;
Fig. 18 is a diagram showing the structure of conversion unit 21 of the sign language recognition unit 2;
Fig. 19 is a diagram explaining the operation of a matching unit (1) 24 of the sign language recognition unit 2;
Fig. 20 is a diagram explaining the operation of a matching unit (2) 23 of the sign language recognition unit 2;
Fig. 21 is a diagram showing the structure of look recognition unit 5;
Fig. 22 is a diagram showing the structure of partial image position detection and cut-out unit 41 of the look
recognition unit 5;
Fig. 23 is a diagram showing the structure of look matching unit 423 of the look recognition unit 5;
Fig. 24 is a diagram showing the structure of sign language CG generating unit; and
Fig. 25 is a diagram showing an example of the structure of another sign language translation system.

[0032] Fig. 1 is a block diagram of a continuous sign language recognition apparatus which may be used in an
embodiment of the present invention.

[0033] In Fig. 1, reference numeral 1 represents data gloves for inputting a sign language to a computer (Data Glove
is a registered trade mark of VPL Research Lab., U.S.A.). Reference numeral 1' represents an interface unit for
converting a sign language into a time sequential continuous pattern of n-th order vector, and reference numeral 2
represents a sign language recognition unit. The sign language recognition unit 2 includes a reference sign language
pattern generating unit 26, a reference sign language pattern storage unit 27, and a continuous sign language
recognition unit 28 for recognizing a continuous sign language pattern.

[0034] In this embodiment, by the data gloves 1 and interface unit 1', the bending of the first and second articulations
of each finger is detected based upon a difference of light intensity passing through optical fibers attached to the data
gloves 1, and the position (spatial coordinates) and direction of each hand are detected by sensors attached to the
[...] of each hand. [...] data of one hand, [...] of both hands, are picked up about [...] times per second.

[0035] A sign language inputted from the data gloves 1 is converted into a time sequential continuous sign language
pattern of n-th order vector. The converted continuous sign language pattern is supplied to the reference sign language
pattern generating unit 26 as its input signal d1a when a reference sign language pattern is generated, and supplied to
the continuous sign language recognition unit 28 as its input signal d1b. The reference sign language pattern
generating unit 26 reads several sample sign language patterns of one word of a sign language of a user, and an
average of these sample patterns is used as a reference sign language pattern of the sign language word, which is
then stored in the reference sign language pattern storage unit 27. In this manner, reference sign language patterns
for other sign language words are stored in the storage unit 27. The continuous sign language recognition unit 28
recognizes inputted continuous sign language patterns by sequentially matching them with reference sign language
patterns d27 stored in the storage unit 27.

[0036] Fig. 2 is a block diagram showing the reference sign language pattern generating unit 26. Reference numeral
261 represents an average time length calculating unit for calculating an average time length of sample sign language
patterns for each word inputted several times by a user. Reference numeral 262 represents a pre-reference pattern
storage unit for storing one pre-reference sign language pattern selected from the sample sign language patterns
inputted several times by the user, the pre-reference pattern being used as a reference for the other sample sign
language patterns. Reference numeral 263 represents a sample storage unit for storing the other sample sign language
patterns excepting the pre-reference pattern. Reference numeral 264 represents a matching unit for checking a
matching between the pre-reference pattern and the other sample sign language patterns. Reference numeral 265
represents an average pattern calculating unit for calculating an average of the several inputted sample sign language
patterns of each word based upon the match results by the matching unit 264. Reference numeral 266 represents a
pattern compression unit.
[0037] All sample sign language patterns d1a of each word inputted several times to the reference sign language
pattern generating unit 26 are supplied to the average time length calculating unit 261. The average time length
calculating unit 261 calculates an average of the time lengths of all sample sign language patterns of each word
inputted several times. The sample sign language pattern having a time length nearest the calculated average time
length is stored as the pre-reference pattern in the pre-reference pattern storage unit 262. The other sample sign
language patterns excepting the pre-reference pattern are stored in the sample storage unit 263. As the pre-reference
pattern to be stored in the storage unit 262, any inputted sample sign language pattern linearly normalized to the
average time length may also be used. The matching unit 264 then sequentially matches the respective sample patterns
d264 stored in the sample storage unit 263 with the pre-reference pattern d263 stored in the pre-reference pattern
storage unit 262. This pattern matching is executed by a dynamic programming (DP) scheme with fixed start and end
points. With this DP scheme, the pre-reference pattern and each sample pattern are matched as shown in Fig. 3 under
the limited condition that the start and end points of both the patterns are always the same. With this matching,
correspondence points between the pre-reference pattern d263 and the sample patterns d264 are obtained. The
matching unit 264 outputs the checked sample pattern and correspondence point information d265. In accordance with
the checked sample patterns and correspondence point information d265, the average pattern calculating unit 265
calculates an average of correspondence points and generates an average sign language pattern d266. Therefore, as
shown in Fig. 4B, there is obtained the average sign language pattern which maintains the characteristic feature of
the sample patterns even if they have nonlinear expansion/compression. The average sign language pattern d266 is
then compressed by the pattern compression unit 266. The reference sign language pattern generating unit 26 outputs
the compressed average sign language pattern as the reference sign language pattern d26. In this manner, the
reference sign language pattern for one sign language word of a user is obtained. Reference sign language patterns of
other words to be recognized are also obtained by the reference sign language pattern generating unit 26.
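The averaging of correspondence points can be illustrated with a short sketch. It assumes a plain symmetric DP path (vertical, horizontal and diagonal steps, unit weights) and a Euclidean frame distance; the DP path restrictions actually used are those of Fig. 3, which are not reproduced here, and the function names are hypothetical.

```python
import math


def dp_correspondence(pre_ref, sample):
    """DP matching with fixed start and end points; returns (i, j) index pairs."""
    n, m = len(pre_ref), len(sample)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(pre_ref[i], sample[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best = min(cost[i - 1][j] if i > 0 else inf,
                       cost[i][j - 1] if j > 0 else inf,
                       cost[i - 1][j - 1] if i > 0 and j > 0 else inf)
            cost[i][j] = d + best
    # Trace the optimal path back from the fixed end point to the fixed start point.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        preds = []
        if i > 0 and j > 0:
            preds.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            preds.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            preds.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(preds)
        path.append((i, j))
    path.reverse()
    return path


def average_by_correspondence(pre_ref, samples):
    """Average every pre-reference frame with the sample frames matched to it."""
    sums = [list(v) for v in pre_ref]
    counts = [1] * len(pre_ref)
    for s in samples:
        for i, j in dp_correspondence(pre_ref, s):
            sums[i] = [a + b for a, b in zip(sums[i], s[j])]
            counts[i] += 1
    return [[x / c for x in row] for row, c in zip(sums, counts)]


pre_ref = [[0.0], [1.0], [0.0], [0.0], [0.0]]
late = [[0.0], [0.0], [0.0], [1.0], [0.0]]
averaged = average_by_correspondence(pre_ref, [late])
```

With these two toy samples the peak of the late sample is matched to the peak of the pre-reference pattern, so the average keeps a single peak of full height, in contrast to a frame-by-frame average after linear normalization.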

[0038] Fig. 5 is a block diagram of the pattern compression unit 266. Reference numeral 2661 represents a velocity 
calculating unit for calculating a velocity from the time sequence of vectors. Reference numeral 2662 represents a min- 
imum velocity detecting unit for detecting a minimum value of calculated velocities. Reference numeral 2663 represents 
a velocity change calculating unit for calculating a change in the direction of the velocity. Reference numeral 2664 rep- 
resents a velocity change time detecting unit for detecting a time when the direction of the velocity changes. 
[0039] The pattern compression unit 266 compresses the inputted average sign language pattern d266 by detecting
characteristic points of the inputted pattern. The velocity calculating unit 2661 calculates the velocity at each timing of
the inputted pattern d266. The velocity of the vector can be obtained by the following equation (1):

$$\vec{v}(t) = \vec{p}(t+1) - \vec{p}(t) \tag{1}$$

where $\vec{p}(t)$ is a pattern vector at time t, and $\vec{v}(t)$ is a velocity vector of the pattern at time t.

[0040] The velocity at each timing obtained by the equation (1) is inputted to the minimum velocity detecting unit 2662
and velocity change calculating unit 2663. The minimum velocity detecting unit 2662 detects the minimum velocity from
the time sequence of inputted velocities, and outputs it as a characteristic point d2661. Namely, the velocity calculating
unit 2661 and minimum velocity detecting unit 2662 detect a kind of "pause".

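[0041] The velocity change calculating unit 2663 calculates a change angle of the direction of the velocity vector. The
change angle of the direction of the velocity vector can be obtained by the following equation (2):

$$\theta(\vec{v}(t), \vec{v}(t+1)) = \cos^{-1}\left(\frac{\vec{v}(t) \cdot \vec{v}(t+1)}{|\vec{v}(t)|\,|\vec{v}(t+1)|}\right) \tag{2}$$

where $\theta(\vec{v}(t), \vec{v}(t+1))$ is an angle between velocity vectors $\vec{v}(t)$ and $\vec{v}(t+1)$.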

[0042] Each change angle of the direction of the velocity vector is integrated by the velocity change point detecting
unit 2664, which outputs a characteristic point d2662 at which the sum of change angles exceeds a predetermined
threshold value. Namely, the characteristic point is represented by a time $T_j$ when the following inequality (3) is
satisfied:

$$\sum_{t=T_i}^{T_j} \theta(\vec{v}(t), \vec{v}(t+1)) \geq \Theta \tag{3}$$

where $\Theta$ is a threshold value of the vector change angle. In this manner, such characteristic points are
sequentially detected and outputted after the time $T_j$. The velocity change calculating unit 2663 and velocity change
point detecting unit 2664 detect a kind of "locus".

[0043] A characteristic point vector output unit 2665 outputs pattern vectors at the two characteristic points d2661 and
d2662 and the time lengths between characteristic points, as a compressed sign language pattern d26. In this manner,
the average sign language pattern is converted into the compressed sign language pattern as shown in Fig. 6. The
compressed sign language pattern of each word is stored in the reference sign language pattern storage unit 27.
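The compression into characteristic points can be sketched as follows. Several details are assumptions made only for this illustration: the velocity is taken as the difference of successive pattern vectors, a "pause" is taken to be a local minimum of the speed, the accumulated angle is reset after each detected "locus" point, and the first and last frames are always kept. The function names are hypothetical.

```python
import math


def velocities(pattern):
    """Difference of successive pattern vectors (assumed form of the velocity)."""
    return [[b - a for a, b in zip(p, q)] for p, q in zip(pattern, pattern[1:])]


def change_angle(u, v):
    """Angle in radians between two velocity vectors; 0 if either is zero."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))


def characteristic_points(pattern, theta):
    """Indices of frames kept as characteristic points."""
    v = velocities(pattern)
    speed = [math.hypot(*x) for x in v]
    points = {0, len(pattern) - 1}
    # "Pause": the speed reaches a local minimum.
    for t in range(1, len(speed) - 1):
        if speed[t] < speed[t - 1] and speed[t] <= speed[t + 1]:
            points.add(t)
    # "Locus": the accumulated direction change exceeds the threshold theta.
    acc = 0.0
    for t in range(len(v) - 1):
        acc += change_angle(v[t], v[t + 1])
        if acc > theta:
            points.add(t + 1)
            acc = 0.0
    return sorted(points)


def compress(pattern, theta):
    """Pairs of (pattern vector, time length to the next characteristic point)."""
    idx = characteristic_points(pattern, theta)
    lengths = [b - a for a, b in zip(idx, idx[1:])] + [0]
    return [(pattern[t], n) for t, n in zip(idx, lengths)]


corner = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [2.0, 2.0]]
compressed = compress(corner, theta=1.0)
```

For the L-shaped toy trajectory only the two end points and the corner survive, each paired with the number of frames up to the next characteristic point.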
[0044] Fig. 7 is a block diagram of the continuous sign language recognition unit 28. Reference numeral 281
represents a pattern compression unit, reference numeral 282 represents a matching unit for checking a matching
between a continuous sign language pattern and reference sign language patterns, and reference numeral 283
represents a reference pattern candidate detecting unit for detecting a reference pattern candidate based upon the
match results by the matching unit 282.

[0045] A continuous sign language pattern inputted by a user is recognized while referring to compressed reference 
sign language patterns of respective words. 

[0046] A continuous sign language pattern d1b inputted to the continuous sign language recognition unit 28 is
compressed by the pattern compression unit 281. The pattern compression unit 281 is similar to the pattern compression
unit 266 used by the reference sign language pattern generating unit 26. The matching unit 282 sequentially matches
the compressed continuous sign language pattern with the compressed reference sign language patterns d27 by using
a continuous DP scheme while nonlinearly changing the patterns in the time domain. A usual continuous DP scheme
uses the restrictions (of DP paths) such as illustrated in Fig. 8 for the vector correspondence between the continuous
sign language pattern and reference sign language patterns. In this pattern matching, distances between vectors within
a "window" shown in Fig. 9 are repetitively calculated while moving the window. Because patterns nonlinearly
compressed in the time domain are directly matched, the restrictions and weights of DP paths shown in Fig. 10 are used.
With such restrictions and weights, the correspondence between patterns can be checked even if they are nonlinearly
compressed in the time domain. Furthermore, a time length between characteristic points is used as the weight at
each path, so that the time length can be reflected upon the distance between corresponding characteristic points for
the accumulated distance in the DP matching. However, the correspondence between characteristic points is taken even
if the difference between time lengths is too large, which may result in unnecessarily expanded/compressed time. For
this reason, in addition to the above-described restrictions and weights, a distance corresponding to a ratio between
time lengths is added. The distance between characteristic points is therefore given by the following equation (4):

$$d(i,j) = w_1 \times d_1(i,j) + w_2 \times d_2(i,j) \tag{4}$$

where $d_1(i,j)$ is a distance defined by the i-th characteristic point vector of the continuous pattern and by the j-th
characteristic point vector of the reference pattern, $d_2(i,j)$ is a distance corresponding to a ratio between time
lengths for the distance defined by the i-th characteristic point vector of the continuous pattern and by the j-th
characteristic point vector of the reference pattern, and $w_1$ and $w_2$ are weights.
[0047] The distance between characteristic points of a vector is represented by a commonly used Euclidean distance.
The distance may be defined by Mahalanobis distance or correlation coefficients. The distance corresponding to a
ratio between time lengths may be obtained from the following equation (5):

$$d_2(i,j) = \left(r(i,j) - 1.0\right)^2 ; \quad r(i,j) \geq 1.0 \tag{5}$$

[...]

$$r(i,j) = \frac{\sum_{k=i_0}^{i} t_I(k)}{\sum_{k=j_0}^{j} t_J(k)}$$

where $t_I(k)$ is a time length at the k-th characteristic point of a continuous pattern, $t_J(k)$ is a time length at
the k-th characteristic point of a reference pattern, $i_0$ is a start point of characteristic points of the continuous
pattern corresponding to j, and $j_0$ is a start point of characteristic points of the reference pattern corresponding
to i.

[0048] In the equation (5), if a characteristic point has a plurality of corresponding characteristic points, the ratio is 
calculated from the time lengths of all associated characteristic points. The equation (5) is defined by a function which 
makes the distance corresponding to the time length ratio become 0.0 when the time length ratio is 1 .0. Other functions 
may be used which calculate the distance by using the ratio between time lengths at corresponding characteristic 
points. 
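The combined distance can be sketched as below. The only property of the time length penalty that the text states is that it is 0.0 when the ratio is 1.0; the symmetric treatment of ratios below 1.0 used here is an assumption chosen for illustration, as are the function names and the default weights.

```python
import math


def ratio_distance(r):
    """Penalty that is 0.0 at r == 1.0 and grows as r departs from 1.0.

    Assumption: ratios below 1.0 are mirrored (r -> 1/r) so that stretching
    and shrinking by the same factor are penalized equally.
    """
    if r <= 0.0:
        raise ValueError("time length ratio must be positive")
    q = r if r >= 1.0 else 1.0 / r
    return (q - 1.0) ** 2


def point_distance(x, y, t_x, t_y, w1=1.0, w2=1.0):
    """Weighted sum of the vector distance and the time length ratio distance."""
    d1 = math.dist(x, y)               # Euclidean distance between vectors
    d2 = ratio_distance(t_x / t_y)     # distance from the time length ratio
    return w1 * d1 + w2 * d2


same_timing = point_distance([0.0, 0.0], [3.0, 4.0], 2.0, 2.0)
stretched = point_distance([0.0, 0.0], [3.0, 4.0], 4.0, 2.0)
```

Raising `w2` relative to `w1` makes the matching less tolerant of characteristic points that are close in space but differ strongly in duration.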

[0049] The continuous DP matching with the above-described DP paths and distance calculation is performed by 
sequentially moving the window shown in Fig. 11. This distance calculation is performed by the following asymptotic 
expansion equations: - * . 

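$$
\begin{aligned}
g(-1,j) &= g(i,-1) = M \quad (\text{sufficiently large}) \\
g(i,0) &= (t_{Ii} + t_{J0}) \times d(i,0) \\
g_1(i,j) &= g(i-1,j) + (t_{Ii} + t_{Jj}) \times d(i,j) \\
g_2(i,j) &= g(i-1,j-1) + (t_{Ii} + t_{Jj}) \times d(i,j) \\
g_3(i,j) &= g(i,j-1) + (t_{Ii} + t_{Jj}) \times d(i,j) \\
g(i,j) &=
\begin{cases}
g_1(i,j) & ;\ \dfrac{g_1(i,j)}{c_1(i,j)} = \min\left(\dfrac{g_1(i,j)}{c_1(i,j)}, \dfrac{g_2(i,j)}{c_2(i,j)}, \dfrac{g_3(i,j)}{c_3(i,j)}\right) & (a) \\
g_2(i,j) & ;\ \dfrac{g_2(i,j)}{c_2(i,j)} = \min\left(\dfrac{g_1(i,j)}{c_1(i,j)}, \dfrac{g_2(i,j)}{c_2(i,j)}, \dfrac{g_3(i,j)}{c_3(i,j)}\right) & (b) \\
g_3(i,j) & ;\ \dfrac{g_3(i,j)}{c_3(i,j)} = \min\left(\dfrac{g_1(i,j)}{c_1(i,j)}, \dfrac{g_2(i,j)}{c_2(i,j)}, \dfrac{g_3(i,j)}{c_3(i,j)}\right) & (c)
\end{cases} \\
c(-1,j) &= c(i,-1) = 0 \\
c(i,0) &= t_{Ii} + t_{J0} \\
c(i,j) &=
\begin{cases}
c(i-1,j) + (t_{Ii} + t_{Jj}) & \text{for (a)} \\
c(i-1,j-1) + (t_{Ii} + t_{Jj}) & \text{for (b)} \\
c(i,j-1) + (t_{Ii} + t_{Jj}) & \text{for (c)}
\end{cases}
\end{aligned}
\qquad (6)
$$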
where d(i,J) is an accumulated distance at the i-th characteristic point of the continuous pattern relative to the reference
pattern, t_{Ii} is the time length at the i-th characteristic point of the continuous pattern, and t_{Jj} is the time length at the j-th
characteristic point of the reference pattern.
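The recurrences of the equation (6) can be sketched in Python as follows. This is an illustrative reading, not the implementation of the apparatus: the local distance d(i,j) is taken to be the Euclidean distance only, c_1, c_2 and c_3 are taken to be the three candidate values of c(i,j) for the paths (a), (b) and (c), all time lengths are assumed positive, and the names `continuous_dp`, `euclidean` and `BIG` are hypothetical.

```python
import math

BIG = 1e9  # stands in for the "sufficiently large" value M


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def continuous_dp(cont, ref):
    """cont, ref: lists of (vector, time_length) characteristic points.

    Returns, for every point i of the continuous pattern, the normalized
    accumulated distance g(i, j) / c(i, j) at the last reference point.
    """
    n_ref = len(ref)
    prev_g = [BIG] * n_ref  # g(i-1, j); starts as the boundary g(-1, j) = M
    prev_c = [0.0] * n_ref  # c(i-1, j); starts as the boundary c(-1, j) = 0
    scores = []
    for vec_i, t_i in cont:
        cur_g, cur_c = [], []
        for j, (vec_j, t_j) in enumerate(ref):
            w = t_i + t_j
            cost = w * euclidean(vec_i, vec_j)
            if j == 0:
                # g(i, 0) and c(i, 0): a match may start at any input point
                cur_g.append(cost)
                cur_c.append(w)
                continue
            candidates = [
                (prev_g[j] + cost, prev_c[j] + w),          # path (a)
                (prev_g[j - 1] + cost, prev_c[j - 1] + w),  # path (b)
                (cur_g[j - 1] + cost, cur_c[j - 1] + w),    # path (c)
            ]
            # Keep the path with the smallest normalized distance g_k / c_k.
            g, c = min(candidates, key=lambda gc: gc[0] / gc[1])
            cur_g.append(g)
            cur_c.append(c)
        scores.append(cur_g[-1] / cur_c[-1])
        prev_g, prev_c = cur_g, cur_c
    return scores
```

Only the previous column of g and c is kept, so the memory cost grows with the length of the reference pattern rather than with the length of the continuous input.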

[0050] A correspondence between characteristic points such as shown in Fig. 12 is obtained by the above-described 
pattern matching. 

[0051] By the continuous DP matching, the accumulated distance d282 at each characteristic point is obtained for 
each reference sign language pattern. The reference pattern candidate detecting unit 283 obtains a minimum value of 
accumulated distances smaller than a predetermined threshold, the point of the minimum value being detected as the 
position of the reference pattern candidate. The reference pattern candidate detecting unit 283 outputs as a reference 
pattern candidate d2 the detected reference pattern, and its start and end points and accumulated distance. 
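The candidate detection described above can be sketched as a search for minima below the threshold. The sketch assumes that "minimum value" means a local minimum of the accumulated distance sequence, and it reports only the position and distance; tracking the start point of each candidate is omitted. The name `detect_candidates` is hypothetical.

```python
def detect_candidates(distances, threshold):
    """Return (position, distance) for each local minimum below threshold."""
    found = []
    for i, d in enumerate(distances):
        if d >= threshold:
            continue
        left = distances[i - 1] if i > 0 else float("inf")
        right = distances[i + 1] if i + 1 < len(distances) else float("inf")
        # A point is a candidate when no neighbour has a smaller distance.
        if d <= left and d < right:
            found.append((i, d))
    return found
```

For example, `detect_candidates([4.5, 0.5, 0.0, 1.3, 0.9, 2.0], 1.0)` returns `[(2, 0.0), (4, 0.9)]`.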
[0052] The data gloves 1 and interface unit 1' of the example may be replaced by a different type of device so long as 
they can convert a sign language into a time sequential continuous pattern of n-th order vector. For example, sign lan- 
guage data picked up by a television camera and converted into time sequential data of n-th order vector may also be 
used. 

[0053] Although the first embodiment is realized by hardware, software may be used. In this case, the reference sign
language pattern generating unit, reference sign language pattern storage unit, and continuous sign language recognition
unit may be realized flexibly by different computers or by a single computer. Figs. 13 to 15 are flowcharts illustrating
the operation of the reference pattern generating method and sign language recognition method realized by software.
[0054] Fig. 13 is a flow chart illustrating the operation of the reference pattern generating unit. At Step 1301, it is
checked whether the reference pattern generating process has been completed for all sign language words for which
the reference patterns are to be generated. If completed, the operation is terminated. If not, an average of time lengths
of sample patterns of a sign language word is calculated at Step 1302. At the next Step 1303, a sample pattern having
a time length nearest the average time length is searched for and stored as a pre-reference pattern. At Step 1304, it is
checked whether the processes at Steps 1305 and 1306 have been performed for all the remaining sample patterns. If
not, at Step 1305 the DP matching with the fixed start and end points between the pre-reference pattern and a sample
pattern is executed. At Step 1306, data at each timing of the sample pattern is added to the data at the corresponding
time of the pre-reference pattern in accordance with the DP matching results, and the number of data added to the data
at the corresponding time is stored. If all the sample patterns have been processed at Step 1304, an average of the added
data at each timing of the pre-reference pattern is calculated. At Step 1308, the averaged pattern is compressed for use
as a reference pattern.
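The flow of Fig. 13 up to the averaging step can be sketched in Python as follows. The sketch assumes a plain symmetric DP matching with Euclidean local distances and assumes that the pre-reference pattern itself is counted once in each average. The names `dtw_path` and `make_reference` are hypothetical, and the final compression step is left out.

```python
import math


def dtw_path(a, b):
    """DP matching with fixed start and end points; returns index pairs (i, j)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(a[i], b[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best
    # Backtrack from the fixed end point to the fixed start point.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i > 0 and j > 0:
            steps.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            steps.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i, j))
    return path[::-1]


def make_reference(samples):
    """samples: list of patterns, each a list of equal-dimension vectors."""
    # Steps 1302 and 1303: pick the sample whose length is nearest the average.
    mean_len = sum(len(s) for s in samples) / len(samples)
    pre_idx = min(range(len(samples)),
                  key=lambda k: abs(len(samples[k]) - mean_len))
    pre = samples[pre_idx]
    sums = [list(v) for v in pre]  # running sums, seeded with the pre-reference
    counts = [1] * len(pre)
    # Steps 1304 to 1306: align every other sample and accumulate its data.
    for k, sample in enumerate(samples):
        if k == pre_idx:
            continue
        for i, j in dtw_path(pre, sample):
            sums[i] = [s + x for s, x in zip(sums[i], sample[j])]
            counts[i] += 1
    # Average of the added data at each timing of the pre-reference pattern.
    return [[s / n for s in vec] for vec, n in zip(sums, counts)]
```

Because the averaging follows the DP correspondence rather than the raw time index, samples signed at different speeds contribute to the matching phase of the motion.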

[0055] Fig. 14 is a flow chart illustrating the operation of the pattern compression unit. At Step 1401, the velocity vector
at each timing is obtained from the input reference pattern, by using the equation (1). At Step 1402, it is checked whether
the processes at Steps 1403 to 1409 have been completed for all the timings of the inputted reference pattern. If not,
the processes at Steps 1403 and 1404, and the processes from Step 1405 to Step 1409, are executed. At Step 1403, it
is checked whether the velocity vector at that time takes a minimum value. If minimum, the time is stored as a characteristic
point at Step 1404 and the flow returns to Step 1402. If not minimum, the flow directly returns to Step 1402. At
Step 1405, an angle change of velocity vectors at that time and at the preceding time is obtained, by using the equation
(2). At the next Step 1406, the angle change at this time is added to the accumulated angle value at the preceding time.
At Step 1407, it is checked whether the accumulated value of the angle change has exceeded a threshold value by
using the inequality (3). If in excess of the threshold value, the time is stored as a characteristic point at Step 1408. At
the next Step 1409, the accumulated angle change value is cleared to prepare for obtaining the next characteristic
point, and the flow returns to Step 1402. In the above operations, returning to Step 1402 is performed only when the
processes at Steps 1403 and 1404 and the processes from Step 1405 to Step 1409 are both completed. If it is judged
at Step 1402 that the inputted reference pattern has been processed for all the timings, a pattern vector at each characteristic
point and a time length between characteristic points are obtained and stored as a compressed pattern at
Step 1410.
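The compression flow of Fig. 14 can be sketched as follows. The equations (1) to (3) are not reproduced here, so the sketch assumes that the velocity vector is the difference of successive samples, that the angle change is the angle between successive velocity vectors, and that both end points of the pattern are always kept. The name `compress` and the parameter `angle_threshold` are hypothetical, and the pattern is assumed to contain at least two samples.

```python
import math


def compress(pattern, angle_threshold):
    """pattern: list of equal-dimension vectors sampled at a fixed rate.

    Returns (points, lengths): the vectors at the characteristic points and
    the number of samples from each characteristic point to the next one.
    """
    vel = [[b - a for a, b in zip(p, q)] for p, q in zip(pattern, pattern[1:])]
    speed = [math.hypot(*v) for v in vel]
    marks = {0, len(pattern) - 1}  # assumption: keep both end points
    acc = 0.0
    for t in range(1, len(vel)):
        # Steps 1403 and 1404: local minimum of the velocity magnitude.
        if t + 1 < len(vel) and speed[t] < speed[t - 1] and speed[t] <= speed[t + 1]:
            marks.add(t)
        # Steps 1405 and 1406: accumulate the change of direction.
        if speed[t] > 0 and speed[t - 1] > 0:
            dot = sum(a * b for a, b in zip(vel[t], vel[t - 1]))
            cos = max(-1.0, min(1.0, dot / (speed[t] * speed[t - 1])))
            acc += math.acos(cos)
        # Steps 1407 to 1409: store the point and clear the accumulator.
        if acc > angle_threshold:
            marks.add(t)
            acc = 0.0
    idx = sorted(marks)
    points = [pattern[i] for i in idx]
    lengths = [b - a for a, b in zip(idx, idx[1:])]
    return points, lengths
```

For a hand position that moves three steps to the right and then three steps upward, the sketch keeps the start point, the corner, and the end point, each separated by a time length of three samples.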

[0056] Fig. 15 is a flow chart illustrating the operation of the sign language recognition unit. At Step 1501, a sign
language pattern is inputted. The inputted sign language pattern is compressed at Step 1502. It is checked at Step 1503
whether the processes from Step 1504 to Step 1506 have been completed for all stored reference patterns. If there is
any reference pattern still not checked, a continuous DP matching is performed between the inputted continuous sign
language pattern and the reference pattern at Step 1504, by using the equations (4) to (6). At Step 1505, a minimum
value of the accumulated distance is obtained from the time sequential data of the accumulated distance calculated by
the DP matching. The time corresponding to the minimum value is stored as the position of the recognized word and
the flow returns to Step 1503. If it is judged at Step 1503 that all the reference patterns have been checked, the sign
language recognition process is terminated.
[0057] According to the example described above, an excellent reference sign language pattern can be generated by
calculating an average of sample sign language patterns based on a correspondence relation obtained through the DP
matching, without damaging the characteristic feature of a sign language which might otherwise be damaged by nonlinear
expansion/compression in the time domain. Furthermore, a general sign language having a series of motions of
fingers and hands can be continuously recognized by the continuous DP matching between continuous sign language
patterns and reference sign language patterns. Still further, characteristic points of reference sign language patterns
and continuous sign language patterns are derived therefrom, and the patterns are compressed in accordance with the
characteristic points. It is therefore possible to efficiently compress sign language patterns, and to recognize sign language
patterns at high speed through a direct matching between compressed sign language patterns.
[0058] Next, an embodiment of the present invention will be described. 

[0059] This embodiment basically uses the continuous DP matching of the example above and provides a sign language
translation system intended for use by a plurality of different users. This system is an integrated total system for
sign language translation wherein not only sign language is inputted from data gloves but also the portrait of a sign language
user is inputted to provide the facial expression.

[0060] This embodiment of the invention will be described in detail with reference to Figs. 16 to 24. First, referring to
Fig. 16, the overall structure of a sign language translation system 100 and the function and data transfer of each element
will be described. This system is broadly divided into eight sections, including a sign language recognition section
(1, 2, 3), a look recognition section (4, 5), a data input/output section (6, 7), a speech recognition section (9, 10), a keyboard
section (8), a display section (13, 14, 15), a voice generating section (11, 12), and a processor (computer) section
(16).

[0061] The operation of the sign language system when used between a sign language user and another person will
first be described. In order to translate a sign language given by a sign language user, time sequential information (d1)
representing the shapes of fingers and the directions and positions of hands of the sign language user is sent from data
gloves 1 to a sign language recognition unit 2 via an interface unit 1'. The sign language recognition unit 2 dynamically
matches the time sequential information d1 with sign language word dictionary data to transform the words contained
in the information (d1) into symbolized words (d2). A spoken language translation unit 3 converts the sequentially inputted
symbolized words (d2) into a spoken language (d3) by supplementing proper particles and the like between words.
[0062] In addition to the above, the portrait (d4) of the sign language user is picked up by a TV camera 4. A look recognition
unit 5 recognizes the look or emotion (smiling, sad, etc.) of the face and outputs the degree (d5) of emotion.

[0063] If voices are used for the translation output medium, the data (d3 and d5) are supplied to the computer 16 and
converted into a spoken language (d12) added with simple emotion-representing adjectives or into voice synthesizing
parameters (d12), which are then sent to a voice synthesizing unit 12. In accordance with the converted data (d12), the
voice synthesizing unit 12 synthesizes voices having the emotion (d5) corresponding to the spoken language (d3).
[0064] If texts are used for the translation output medium, the data (d3 and d5) supplied to the computer 16 are converted
into a spoken language (d13) added with simple emotion-representing adjectives, and sent to a monitor 13 to
display a text.

[0065] If other sign languages are used for the translation output medium, the computer 16 derives words (d14) from
the spoken language (d3) added with simple emotion-representing adjectives, and sends them to a sign language CG
generating unit 14 to display corresponding sign language words on the monitor 13.

[0066] Another person who has understood the translation output supplied from the translation output medium communicates
by using a keyboard 8, a microphone 9, or data gloves 1 (which may be provided at another sign recognition section).
For the text conversation, a text (d8) is inputted from the keyboard 8 and displayed on the monitor 13. Alternatively, a
text (d8) constituted by only simple words is inputted from the keyboard 8, and converted into corresponding sign language
words by the sign language CG generating unit 14 to display them on the monitor 13.

[0067] For the voice conversation using the microphone 9, voice data for each word is recognized and converted into
symbolized words (d9) by a voice recognition unit 10. The symbolized words are displayed on the monitor 13 in the form
of text, or alternatively in the form of corresponding sign language words converted by the sign language CG generating
unit 14. For the sign language conversation using the data gloves 1, the sign language is outputted as voices, texts, or
sign language words, in the manner described previously.

[0068] The computer 16 controls the entirety of the processor section and performs simple data conversion. The voice
recognition unit 10 and voice synthesizing unit 12 can be easily realized by using already developed techniques (e.g.,
"Auditory Sense and Voices", The Institute of Electronics and Communication Engineers, compiled by Tanetoshi
MIURA, 1980).

[0069] In the following, the details of the sign language recognition section (1, 2, 3), look recognition section (4, 5),
and data input/output section (6, 7) shown in Fig. 16 will be given.

[0070] Fig. 17 shows the detailed structure of the sign language recognition unit 2. The functions of the recognition
unit 2 include:

Function (1): It is assumed that sign language word data of a first sign language user (or sign language word data
common to a plurality of sign language users) has already been registered in a dictionary. Prior to using the sign
language of a second sign language user, sign language word data of the second sign language user is picked up
from data gloves for a predetermined number of words. The sign language word data of each word is compared
with the corresponding data in the sign language word dictionary, and is calibrated to compensate for a personal
difference between sign language data, i.e., to make the sign language data of the second sign language user more
similar to that of the first sign language user. This calibration is executed only at the set-up of the system.

Function (2): Time sequential sign language word data of the second user after the calibration is dynamically
matched with the sign language word dictionary data of the first user to recognize words contained in the time
sequential data. Function (2) basically uses the continuous DP matching described with the first embodiment.

[0071] The operation of realizing Function (1) will be described.

[0072] The second sign language user inputs one sign language word d1 after another among the predetermined
words to a matching unit (1) 24. The corresponding sign language word dictionary data d22 is read from a sign language
word dictionary storage unit 22 and inputted to the matching unit (1) 24. The matching unit (1) 24 performs a
start/end point fixed dynamic matching. The sign language word dictionary storage unit 22 corresponds to the reference
sign language pattern storage unit 27 of the first embodiment, and it is assumed that reference sign language patterns of
the first user (or patterns common to a plurality of users) are stored in correspondence with respective sign language
words.

[0073] Different from general finger spellings, a sign language word is represented by the motion of fingers and hands.
Therefore, as shown in Fig. 19, the data obtained from the data gloves 1 is represented by a time sequential multidimensional
function P(t) (a function formed by a number of one-dimensional functions) of f11 to f52, hd1 to hd3, and hpx
to hpz. Functions f11 to f52 represent the angles of finger articulations (e.g., f11 represents the finger angle at the first
articulation of a thumb, f12 represents the finger angle at the second articulation of the thumb, and f21 to f52 represent
the finger angles at the first and second articulations of the other fingers). Functions hd1 to hd3 represent the directions
of the palm of a hand, and functions hpx to hpz represent the position of the hand.



EP 0 585 098 B1

[0074] The sign language word dictionary data for each word is represented in a similar manner by a time sequential 
multidimensional function p(t). 

[0075] The time sequential data only for one hand is shown in Fig. 19 for the purpose of simplicity. In practice, data
for both hands is used so that the dimension of the function is doubled.
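One way to picture a single time step of P(t) is the following Python sketch. It assumes two articulation angles for each of five fingers (f11 to f52), three palm direction components (hd1 to hd3), and three position components (hpx to hpz), which gives 16 values per hand and 32 for both hands. The names `HandSample` and `frame_vector` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HandSample:
    finger_angles: List[float]   # f11..f52: 5 fingers x 2 articulations
    palm_direction: List[float]  # hd1..hd3
    position: List[float]        # hpx..hpz

    def as_vector(self) -> List[float]:
        return self.finger_angles + self.palm_direction + self.position


def frame_vector(left: HandSample, right: HandSample) -> List[float]:
    """One time step of P(t) for both hands (dimension doubled)."""
    return left.as_vector() + right.as_vector()
```

A whole sign language word is then a list of such frame vectors ordered in time.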

[0076] As shown in Fig. 19, there is certainly a personal difference between the inputted sign language word data of
the second user and that of the first user (or average person). To absorb such difference, it is necessary to convert the
inputted sign language word data of the second user into corresponding data of the first user. To this end, a start/end
point fixed dynamic matching is first executed to obtain optimum correspondence points therebetween.
[0077] The start/end point fixed dynamic matching is a kind of dynamic template matching wherein the start and end
points of the input data and dictionary data are fixed while the other points are allowed to be moved in the time domain.
This dynamic matching may use a known technique (refer to "Pattern Recognition and Learning Algorithms" by Yoshinori
KAMISAKA and Kazuhiko OZEKI, Bunichi Sogo Shuppan, p. 91).

[0078] Several sample words are subject to this matching to obtain correspondence information d24 such as shown
in Fig. 19. The information d24 is correspondence data between the input data of the second user and the dictionary
data of the first user, and includes, as shown in Fig. 19, data P(A), P(B), P(C), P(D), P(E), P(F), P(G),... and p(a),
p(b), p(c), p(d), p(e), p(f), p(g),....
[0079] Returning to Fig. 17, reference numeral 25 represents a selector unit. When Function (1) is to be realized,
the selector unit 25 does not select data d1, but selects the correspondence data d24 and sends it to a conversion unit
21.

[0080] As shown in Fig. 18, the conversion unit 21 includes a layer-type neural network 211 and selectors 212 and
213. In the conversion unit 21, the selector 213 selects the data P(A), P(B), P(C), P(D), P(E), P(F), P(G),... from the
inputted data d25 and inputs them to the neural network 211, and the selector 212 uses the data p(a), p(b), p(c), p(d),
p(e), p(f), p(g),... as the learning data d212 of the neural network 211.

[0081] The learning of the neural network 211 is made by using the several sample data d25 to obtain weight coefficients
of the neural network 211 which define the rule of conversion from the time sequential data P(t) of the second
user to the dictionary data p(t) of the first user.
[0082] The operation of realizing Function (2) will be described. 

[0083] In realizing Function (2), the selector unit 25 shown in Fig. 17 selects the data d1 and inputs it as the data d25 to
the selector 213 shown in Fig. 18. The data d25 selected by the selector 213 is supplied to the neural network 211, and
the selector 212 selects the output of the neural network 211 as the data d212, which is outputted as the data d21.
[0084] The inputted sign language time sequential data d1 of the second user is a time sequence of words of a sign
language, different from the above-described calibration. The time sequential data d1 of the second user is converted
by the learnt layer-type neural network 211 into time sequential data d21 with the personal difference being removed.
As described previously, this converted time sequential data d21 is similar to the dictionary data of the first user
because of Function (1) provided by the neural network 211.

[0085] Another matching unit (2) 23 shown in Fig. 17 matches the converted time sequential data d21 with the dictionary
data to detect a train of words included in the sign language time sequential data d21.

[0086] The operation of the matching unit (2) 23 will be described with reference to Fig. 20, the fundamental operation 
being the same as the first example. 

[0087] In Fig. 20, d21 represents the inputted sign language time sequential data, and d22 represents the sign language
word dictionary data. The inputted time sequential data d21 is represented by a function of a multidimensional
vector, as described previously. The sign language word dictionary data d22 includes, for example as shown in Fig. 20,
data A, B, C,... which are dynamically matched with the inputted time sequential data d21.

[0088] This dynamic matching is performed by using a method called a continuous DP matching scheme ("Continuous
Word Recognition using Continuous DP", by OKA, the Speech Study Group of Acoustical Society of Japan, S78-20,
pp. 145 to 152, 1978).

[0089] The fundamental operation of this dynamical matching is to scan each sign language word dictionary data on 
the inputted sign language time sequential data to match the former with the latter at each timing, and to recognize a 
word in the latter at the position (timing) with the best similarity. 

[0090] During the matching, expansion/compression is allowed to a certain degree in the time domain. The matching
unit (2) 23 outputs time sequentially recognized words as data d2.

[0091] A spoken language conversion unit 3 adds dependent words such as articles, prepositions, particles, and auxiliary
verbs to the recognized words in accordance with the conversion rules, and outputs a symbolized language more similar
to a spoken language, as disclosed in U.S. Patent Application Serial No. 08/029046 filed March 9, 1993.
[0092] Next, the details of the look recognition unit 5 will be described with reference to Figs. 21 to 23.
[0093] Fig. 21 shows the detailed structure of the look recognition unit 5. An input d4 represents a portrait of a sign 
language user, and output d5 represents a type of emotion and a numerical degree of emotion (e.g., a delight degree 
of 60%, a grief degree of 10%).
[0094] A block 41 shown in Fig. 21 performs a position detection (d410) of the eyes, mouth, nose and eyebrows of
the portrait, and a cut-out of the eye image (d414), mouth image (d413), nose image (d412), and eyebrow image 
(d411). 

[0095] Look matching units (424, 423, 422, 421) detect the emotion degrees of the partial images (d424, d423, d422,
d421) while referring to the reference look pattern dictionaries (434, 433, 432, 431).

[0096] In accordance with these emotion degrees and the emotion degrees to be determined from the positions
(d410) of the eyes, mouth, nose, and eyebrows, the total judgement unit 44 determines the final emotion degree
and type and outputs them.

[0097] Fig. 22 shows the details of the position detection and image cut-out. 

[0098] A block 411 derives a face image d4110 without the background of the portrait by obtaining an absolute difference
between a pre-registered background 412 and the input portrait d4.
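A minimal sketch of this background removal is given below. It assumes grey-level images stored as nested lists and introduces a hypothetical `threshold` parameter to decide when a pixel differs from the background; neither detail is stated in the text. The name `remove_background` is also hypothetical.

```python
def remove_background(portrait, background, threshold):
    """portrait, background: equally sized 2-D lists of grey levels.

    Pixels whose absolute difference from the background is at most
    threshold are set to 0; all other pixels keep the portrait value.
    """
    return [
        [p if abs(p - b) > threshold else 0 for p, b in zip(prow, brow)]
        for prow, brow in zip(portrait, background)
    ]
```

Pixels that survive the comparison form the face image on which the later position detection operates.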

[0099] A block 413 first detects the position and size of the total face image d4110 by using an image processing technique
such as projection distribution.

[0100] The block 413 then detects the position of each partial image such as the eye image, mouth image, nose
image, and eyebrow image, through a template matching between the total face image d4110 without background and
each standard partial image pattern 414 which is used as a template, while referring to reference position information
414.

[0101] In this case, if the reference position information 414 is normalized by the position and size of the detected 
total face image d4110, the range of the area to be subject to the template matching can be narrowed, realizing an 
improved precision and efficiency of the matching. 

[0102] Once the position of each partial image has been detected, the image can be cut out easily. An output d410 represents
the position of each partial image, and d414, d413, d412, and d411 represent the eye image, mouth image,
nose image, and eyebrow image, respectively.

[0103] Fig. 23 shows the detailed structure of the mouth matching unit 423 of the look recognition unit 5. 

[0104] The configuration characteristics of the mouth image d413, such as the area, aspect ratio, and xy projection
distribution, are derived by a characteristic deriving unit by an image processing technique.

[0105] The configuration characteristics of the mouth image d413 are compared with the pre-registered characteristics
such as delighted mouth and sad mouth configurations (through comparison between distances on a configuration
characteristic space) to determine the most similar configuration characteristic. Then, data d423 is outputted which represents
the type of the emotion belonging to the determined most similar configuration characteristic and the emotion
degree in inverse proportion to the distance on the configuration characteristic space.
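The matching described above can be sketched as a nearest-prototype search in the configuration characteristic space. The sketch assumes Euclidean distance and uses 1 / (1 + d) as one bounded reading of "inverse proportion"; the function name `match_look` and the example prototypes are hypothetical.

```python
import math


def match_look(features, prototypes):
    """features: configuration characteristics of one partial image.
    prototypes: dict mapping an emotion type to its registered features.

    Returns (emotion_type, degree); the degree shrinks as distance grows.
    """
    emotion, dist = min(
        ((name, math.dist(features, proto)) for name, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    # Assumption: 1 / (1 + d) keeps the degree within (0, 1].
    return emotion, 1.0 / (1.0 + dist)
```

With prototypes `{"delight": [1.0, 0.0], "grief": [0.0, 1.0]}`, the input `[0.0, 2.0]` is one unit away from the grief prototype, so the sketch returns `("grief", 0.5)`.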

[0106] The total judgement unit 44 of the look recognition unit 5 determines the final emotion type on the majority 
basis from the emotion types and degrees obtained from the configuration characteristics of respective partial images 
and from the emotion types and degrees obtained from the coordinate positions of respective partial images. An aver- 
age of emotion degrees belonging to the same final emotion type is used as the final emotion degree (d5). 
[0107] In determining the emotion type and degree to be obtained from the coordinate position of each partial image, 
the coordinate position at each point of a reference emotion image is stored in advance. The sum of differences of coor- 
dinate positions at respective points between the input image and reference image is obtained and used for the deter- 
mination of the emotion type and degree. Namely, the emotion type of the reference image providing the minimum sum 
of differences is used as the final emotion type, and the emotion degree in inverse proportion to the minimum is used
as the final emotion degree. 
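The majority rule of the total judgement unit 44 can be sketched as follows. Each input pair is assumed to come from one partial image or from the coordinate positions; how ties between emotion types are broken is not stated in the text, so the sketch simply keeps the type that was encountered first. The name `total_judgement` is hypothetical.

```python
from collections import Counter


def total_judgement(results):
    """results: list of (emotion_type, degree) pairs to be combined."""
    votes = Counter(emotion for emotion, _ in results)
    # Majority decision on the emotion type.
    final_type = votes.most_common(1)[0][0]
    # Average the degrees that belong to the winning type only.
    degrees = [deg for emotion, deg in results if emotion == final_type]
    return final_type, sum(degrees) / len(degrees)
```

Degrees reported for the losing emotion types do not influence the final degree.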

[0108] The data input/output section (6, 7) shown in Fig. 16 will be described next. 

[0109] This section is used for storing data (1) in a floppy disk or reading it therefrom, the data (1) being original data
from the data gloves inputted by a sign language user or the text data converted into a spoken language, and for storing
data (2) in the floppy disk 6 or reading it therefrom, the data (2) being the parameters (weight coefficients) of the layer-type
neural network of the conversion unit 21 of the sign language recognition section 2.

[0110] The read data (1) is sent to the monitor at the speech synthesizing unit or to the monitor at the sign language
CG generating unit to display it. The read data (2) of parameters (weight coefficients) is set to the layer-type neural network
of the conversion unit 21.

[0111] Fig. 24 shows the structure of the sign language CG generating unit 14. Word information indicated by d14 in
Fig. 24 is supplied to an address generator 142 which generates the address of a CG word pattern corresponding to
the supplied word information. A CG word pattern dictionary sends the designated CG word pattern d131 to the monitor
13 to display it.

[0112] Each section of the sign language translation system has been described in detail. With this structure, it is possible
to realize the system capable of solving the conventional issues still not settled.

[0113] In the embodiment of the sign language translation system 100, wired data transfer has been assumed. All or
part of the data transfer may be replaced by wireless data transfer to provide a better operability of the system.

[0114] Fig. 25 shows a sign language translation system wherein a plurality of sign language translation systems 100
described above are interconnected by a local area network (LAN) 103 (or wireless LAN) via LAN interfaces 101. With
such a system, data communication or an access to a database 102 by sign languages or gestures becomes possible
within or between public or other facilities such as hospitals, police stations, and public offices.

[0115] According to the embodiment, a continuous sign language even with a personal difference can be dynamically
recognized and converted into a spoken language.

[0116] It is also possible to recognize the look or emotion of a sign language user, making it possible to generate a spoken
language with emotion.

[0117] It is also possible to make it easy to have a conversation between a sign language user and an ordinary user.
[0118] It is apparent that the sign language translation apparatus and system of the second embodiment may be
applied to the sign recognition apparatus of the first example.

Claims 
1. A sign language translation system having input means (1, 1') for inputting a continuously expressed sign language
as time sequential data, means (2') for recognizing a sign language from said time sequential data, and means (3)
for translating said recognized sign language into a spoken language, said sign language recognition means comprising:

a sign language word dictionary (22) for storing first sign language time sequential data of each sign language
word;

means (21) for calibrating second sign language time sequential data inputted from said input means so as to
make said first sign language time sequential data stored in said sign language word dictionary correspond to
said second sign language time sequential data;

means (23) for matching said second sign language time sequential data calibrated by said calibration means
with said first sign language time sequential data stored in said sign language word dictionary, and recognizing
the sign language word corresponding to said second sign language time sequential data;

means (4) for inputting a portrait of a sign language user, look recognition means (5) for recognizing the look
of the portrait to obtain the emotion type and the emotion degree (intensity), and processor means (16) for A)
inputting a spoken language outputted from said spoken language translation means (3) and the emotion type
and degree outputted from said look recognition means (5), and B) outputting a spoken language added with
emotional adjectives.

2. A sign language translation system according to claim 1, further comprising:

matching means (24) for receiving said second sign language time sequential data of a sign language word
and said first sign language time sequential data corresponding to said second sign language time sequential
data, obtaining and outputting correspondences between both said sign language sequential data at each timing; and

selection means (25) for selecting one of said second sign language time sequential data and an output of said
matching means, and outputting said selected one to said calibration means,

wherein said calibration means learns recognition parameters for the calibration in accordance with an output
of said matching means.

3. A sign language translation system according to claim 1, wherein said spoken language translation means (3) adds
a dependent word to the sign language word outputted from said sign language recognition means, in accordance
with a rule, and outputs a spoken language.

4. A sign language translation system according to claim 1, further comprising sound output means (11, 12) for synthesizing
and outputting sounds, wherein said processor means (16) outputs sounds corresponding to the emotion
type and degree.

5. A sign language translation system according to claim 1, further comprising text output means (13) for outputting a
text, wherein said processor means (16) outputs said text corresponding to the emotion type and degree.



6. A sign language translation system according to claim 1, further comprising sign language graphics means (13, 14)
for outputting a sign language as CG, wherein said processor means (16) outputs said CG corresponding to the
emotion type and degree.
7. A sign language translation system according to claim 1, further comprising voice input means (9, 10) having a
microphone and voice recognition means. 

8. A sign language translation system according to claim 1, further comprising character input means (8) for inputting
a character from a keyboard.

9. A sign language translation system according to claim 1, wherein said sign language translation system is installed
at each of a plurality of stations connected to a local area network so that information can be transferred between
a plurality of said sign language translation systems.

PatentansprOche 

1. A sign language translation system comprising input means (1, 1') for inputting a continuously expressed
sign language as time-sequential data, means (2') for recognizing a sign language from said time-sequential
data, and means (3) for translating said recognized sign language into a spoken language,
said sign language recognition means comprising:

a sign language word dictionary (22) for storing first sign language time-sequential data of each sign
language word;

means (21) for calibrating second sign language time-sequential data inputted from said input means so as
to make said first sign language time-sequential data stored in said sign language word dictionary
correspond to said second sign language time-sequential data;

means (23) for matching said second sign language time-sequential data calibrated by said calibration
means with said first sign language time-sequential data stored in said sign language word dictionary,
and recognizing the sign language word corresponding to said second sign language time-sequential data;

means (4) for inputting a portrait of a sign language user, look recognition means (5) for recognizing the
look of the portrait to obtain the emotion type and the emotion degree (intensity), and processor means (16)
for A) inputting a spoken language outputted from said spoken language translation means (3) and the
emotion type and emotion degree outputted from said look recognition means (5), and B) outputting a
spoken language supplemented with emotional adjectives.

2. A sign language translation system according to claim 1, further comprising:

matching means (24) for receiving said second sign language time-sequential data of a sign language word
and said first sign language time-sequential data corresponding to said second sign language
time-sequential data, and for obtaining and outputting correspondences between both sets of sign language
time-sequential data at each selected time; and

selector means (25) for selecting one of said second sign language time-sequential data and an output of
said matching means, and outputting the selected one to said calibration means,

wherein said calibration means learns recognition parameters for the calibration in accordance with an
output of said matching means.

3. A sign language translation system according to claim 1, wherein said spoken language translation means
(3) adds a dependent word to the word outputted from said sign language recognition means, in
accordance with a rule, and outputs a spoken language.

4. A sign language translation system according to claim 1, further comprising sound output means (11, 12) for
synthesizing and outputting sounds, wherein said processor means (16) outputs the sounds corresponding to the
emotion type and the emotion degree.

5. A sign language translation system according to claim 1, further comprising text output means (13) for
outputting a text, wherein said processor means (16) outputs the text corresponding to the emotion type and
the emotion degree.



6. A sign language translation system according to claim 1, further comprising sign language graphics means (13,
14) for outputting a sign language as CG, wherein said processor means (16) outputs the CG corresponding to
the emotion type and the emotion degree.

7. A sign language translation system according to claim 1, further comprising voice input means (9, 10) having a
microphone and voice recognition means.

8. A sign language translation system according to claim 1, further comprising character input means (8) for
inputting a character from a keyboard.

9. A sign language translation system according to claim 1, wherein said sign language translation system
is installed at each of a plurality of stations connected to a local area network, so that
information can be transferred between a plurality of said sign language translation systems.

Claims (French text)



1. A sign language translation system comprising input means (1, 1') for inputting a sign language
expressed continuously in the form of sequential data, means (2') for recognizing a sign language
from said sequential data, and means (3) for translating said recognized sign language into a
spoken language, said sign language recognition means comprising:

a sign language word dictionary (22) for storing first sign language sequential data of each
sign language word;

means (21) for calibrating second sign language sequential data inputted from said input means
so that said first sign language sequential data stored in said sign language word dictionary
correspond to said second sign language sequential data;

means (23) for matching said second sign language sequential data calibrated by said calibration
means against said first sign language sequential data stored in said sign language word
dictionary, and recognizing the sign language word corresponding to said second sign language
sequential data;

means (4) for inputting a portrait of a sign language user, facial expression recognition means (5)
for recognizing the facial expression of the portrait to obtain the emotion type and the emotion
degree (intensity), and processing means (16) for A) receiving as input a spoken language outputted
by said sign language translation means (3) and the emotion type and degree outputted by said
facial expression recognition means (5), and B) outputting a spoken language supplemented with
emotional adjectives.

2. A sign language translation system according to claim 1, further comprising:

matching means (24) for receiving said second sign language sequential data of a sign language word
and said first sign language sequential data corresponding to said second sign language sequential
data, and for obtaining and outputting correspondences between the two sets of sign language
sequential data at each instant; and

selection means (25) for selecting said second sign language sequential data or an output of said
matching means, and sending what has been selected to said calibration means,

wherein said calibration means learn recognition parameters for the calibration in accordance
with an output of said matching means.

3. A sign language translation system according to claim 1, wherein said spoken language translation
means (3) add a dependent word to the sign language word outputted by said sign language recognition
means, in accordance with a rule, and output a spoken language.

4. A sign language translation system according to claim 1, further comprising sound output means
(11, 12) for synthesizing and emitting sounds, wherein said processing means (16) emit sounds
corresponding to the emotion type and degree.

5. A sign language translation system according to claim 1, further comprising text output means
(13) for outputting a text, wherein said processing means (16) output said text corresponding to
the emotion type and degree.

6. A sign language translation system according to claim 1, further comprising sign language graphics
means (13, 14) for outputting a sign language in the form of graphics, wherein said processing
means (16) output said graphics corresponding to the emotion type and degree.

7. A sign language translation system according to claim 1, further comprising voice input means
(9, 10) having a microphone and voice recognition means.

8. A sign language translation system according to claim 1, further comprising character input means
(8) for inputting a character from a keyboard.

9. A sign language translation system according to claim 1, wherein said sign language translation system
is installed at each of a plurality of stations connected to a local area network, so that
information can be transferred between a plurality of said sign language translation systems.